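Natural language processing (NLP) algorithms are rapidly improving but often struggle when applied to out-of-distribution examples. A prominent approach to mitigating the domain gap is domain adaptation, where a model trained on a source domain is adapted to a new target domain. We present a new learning setup, "domain adaptation from scratch", which we believe to be crucial for extending the reach of NLP to sensitive domains in a privacy-preserving manner. In this setup, we aim to efficiently annotate data from a set of source domains such that the trained model performs well on a sensitive target domain from which data is unavailable for annotation. Our study compares several approaches for this challenging setup, ranging from data selection and domain adaptation algorithms to active learning paradigms, on two NLP tasks: sentiment analysis and named entity recognition. Our results suggest that using the above approaches eases the domain gap, and that combining them further improves the results.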
translated by 谷歌翻译
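Multi-task learning, in which several tasks are jointly learned by a single model, allows NLP models to share information from multiple annotations and may facilitate better predictions when the tasks are inter-related. This technique, however, requires annotating the same text with multiple annotation schemes, which may be costly and laborious. Active learning (AL) has been shown to optimize annotation processes by iteratively selecting the unlabeled examples whose annotation is most valuable for the NLP model. Yet multi-task active learning (MT-AL) has not been applied to state-of-the-art pre-trained Transformer-based NLP models. This paper aims to close this gap. We explore various multi-task selection criteria in three realistic multi-task scenarios, reflecting different relations between the participating tasks, and demonstrate the effectiveness of multi-task selection compared to single-task selection. Our results suggest that MT-AL can be used effectively to minimize annotation effort for multi-task NLP models.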
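As a rough illustration of what a multi-task selection criterion can look like, the sketch below scores each unlabeled example by its prediction entropy averaged over tasks and picks the highest-scoring ones. This is only one plausible criterion, not necessarily one evaluated in the paper; the function names, the task names `task_a` and `task_b`, and the toy probabilities are all assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_annotation(task_probs, k):
    """Return indices of the k examples with the highest mean entropy across tasks.

    task_probs maps a (hypothetical) task name to a list of per-example class
    distributions; every task must cover the same examples in the same order.
    """
    per_task = [[entropy(p) for p in dists] for dists in task_probs.values()]
    n = len(per_task[0])
    # Aggregate uncertainty across tasks by simple averaging (an assumption).
    scores = [sum(task[i] for task in per_task) / len(per_task) for i in range(n)]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy pool of three unlabeled examples, two tasks with binary outputs.
pool = {
    "task_a": [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4]],
    "task_b": [[0.8, 0.2], [0.5, 0.5], [0.99, 0.01]],
}
picked = select_for_annotation(pool, k=1)
```

Averaging treats all tasks equally; a single-task criterion would instead rank by one task's entropy alone, which is the kind of comparison the abstract describes.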
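Most works on modeling the conversation history in Conversational Question Answering (CQA) report their main results on a common CQA benchmark. While existing models show impressive results on CQA leaderboards, it remains unclear whether they are robust to shifts in setting (sometimes to more realistic ones), training data size (e.g., from large to small sets), and domain. In this work, we design and conduct the first large-scale robustness study of history modeling approaches for CQA. We find that high benchmark scores do not necessarily translate to strong robustness, and that various methods can perform extremely differently under different settings. Equipped with the insights from our study, we design a novel prompt-based history modeling approach and demonstrate its strong robustness across various settings. Our approach is inspired by existing methods that highlight historic answers in the passage. However, instead of highlighting by modifying the passage token embeddings, we add textual prompts directly in the passage text. Our approach is simple, easy to plug into practically any model, and highly effective, so we recommend it as a starting point for future model developers. We also hope that our study and insights will raise awareness of the importance of robustness-focused evaluation, in addition to obtaining high leaderboard scores, leading to better CQA systems.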
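The idea of highlighting history answers with text rather than embeddings can be sketched as a small preprocessing step. The abstract does not specify the prompt format, so the numbered `<i>...</i>` markers, the function names, and the example passage below are assumptions chosen purely for illustration.

```python
def span_of(text, answer):
    """Character span (start, end) of the first occurrence of `answer` in `text`."""
    start = text.find(answer)
    if start < 0:
        raise ValueError(f"answer not found in passage: {answer!r}")
    return (start, start + len(answer))


def mark_history_answers(passage, history_spans):
    """Wrap each previous answer span in numbered textual markers.

    history_spans: (start, end) character offsets into `passage`, ordered from
    the oldest turn to the most recent; spans are assumed not to overlap.
    """
    indexed = list(enumerate(history_spans, start=1))
    # Insert markers from the rightmost span first so earlier offsets stay valid.
    for turn, (start, end) in sorted(indexed, key=lambda item: item[1][0], reverse=True):
        passage = f"{passage[:start]}<{turn}>{passage[start:end]}</{turn}>{passage[end:]}"
    return passage


passage = "Marie Curie was born in Warsaw and later moved to Paris."
history = [span_of(passage, "Warsaw"), span_of(passage, "Paris")]
marked = mark_history_answers(passage, history)
```

Because the output is ordinary text, it can be fed to any reader model without touching its embedding layer, which is what makes this style of highlighting easy to plug in.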
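Contemporary predictive models are hard to interpret because their deep nets exploit numerous complex relations between input elements. This work proposes a theoretical framework for model interpretability by measuring the contribution of relevant features to the functional entropy of the network with respect to the input. We rely on the log-Sobolev inequality, which bounds the functional entropy by the functional Fisher information with respect to the covariance of the data. This provides a principled way to measure the information contribution of a subset of features to the decision function. Through extensive experiments, we show that our method surpasses existing sampling-based interpretability methods on various data signals such as image, text, and audio.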
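For reference, the standard Gaussian form of the log-Sobolev inequality can be written as follows. The notation is an assumption, since the abstract fixes none: $f$ is a non-negative function, $\mu$ is a Gaussian measure with covariance $\Sigma$, and the paper's exact statement may differ.

```latex
\operatorname{Ent}_{\mu}(f)
  = \int f \log f \, d\mu
  - \Big(\int f \, d\mu\Big) \log\Big(\int f \, d\mu\Big),
\qquad
\operatorname{Ent}_{\mu}(f)
  \le \frac{1}{2} \int \frac{\langle \Sigma \nabla f, \nabla f \rangle}{f} \, d\mu .
```

The right-hand side is the functional Fisher information weighted by the covariance, so restricting the gradient to a subset of input features gives a bound on how much that subset can contribute to the functional entropy.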
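A fundamental goal of scientific research is to learn about causal relationships. However, despite its critical role in the life and social sciences, causality has not had the same importance in natural language processing (NLP), which has traditionally placed more emphasis on predictive tasks. This distinction is beginning to fade, with an emerging area of interdisciplinary research at the convergence of causal inference and language processing. Still, research on causality in NLP remains scattered across domains, without unified definitions, benchmark datasets, or clear articulations of the challenges and opportunities in applying causal inference to the textual domain, with its unique properties. In this survey, we consolidate research across academic areas and situate it in the broader NLP landscape. We introduce the statistical challenge of estimating causal effects with text, encompassing settings where text is used as an outcome, as a treatment, or to address confounding. In addition, we explore potential uses of causal inference to improve the robustness, fairness, and interpretability of NLP models. We thus provide a unified overview of causal inference for the NLP community.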
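Persuasion games are fundamental in economics and AI research and serve as the basis for important applications. However, work on this setup assumes communication with stylized messages that do not consist of rich human language. In this paper, we consider a repeated sender (expert) and receiver (decision maker) game, where the sender is fully informed about the state of the world and aims to persuade the receiver to accept a deal by sending one of several possible natural language reviews. We design an automatic expert that plays this repeated game, aiming to achieve the maximal payoff. Our expert is implemented within the Monte Carlo Tree Search (MCTS) algorithm, with deep learning models that exploit behavioral and linguistic signals to predict the next action of the decision maker and the future payoff of the expert, given the state of the game and a candidate review. We demonstrate the superiority of our expert over strong baselines, its adaptability to different decision makers, and that its selected reviews are well adapted to the proposed deal.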
Although many studies have successfully applied transfer learning to medical image segmentation, very few of them have investigated the selection strategy when multiple source tasks are available for transfer. In this paper, we propose a prior knowledge guided and transferability based framework to select the best source tasks among a collection of brain image segmentation tasks, to improve the transfer learning performance on the given target task. The framework consists of modality analysis, RoI (region of interest) analysis, and transferability estimation, such that the source task selection can be refined step by step. Specifically, we adapt the state-of-the-art analytical transferability estimation metrics to medical image segmentation tasks and further show that their performance can be significantly boosted by filtering candidate source tasks based on modality and RoI characteristics. Our experiments on brain matter, brain tumor, and white matter hyperintensities segmentation datasets reveal that transferring from different tasks under the same modality is often more successful than transferring from the same task under different modalities. Furthermore, within the same modality, transferring from the source task that has stronger RoI shape similarity with the target task can significantly improve the final transfer performance. And such similarity can be captured using the Structural Similarity index in the label space.
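To make the label-space similarity idea concrete, here is a minimal sketch of a Structural Similarity score computed between two segmentation masks. It is a simplified single-window variant using population statistics over the whole mask, not the sliding-window SSIM of standard image libraries, and it is not claimed to be the paper's exact procedure; the function name, constants, and toy masks are assumptions.

```python
def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized, flattened label maps."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Small constants stabilise the ratio when means or variances are near zero.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )


# Toy flattened binary masks: a target RoI, a shifted RoI, and an inverted RoI.
target = [0, 0, 1, 1, 1, 0, 0, 0]
shifted = [0, 1, 1, 1, 0, 0, 0, 0]
inverted = [1, 1, 0, 0, 0, 1, 1, 1]
scores = {
    "shifted": global_ssim(target, shifted),
    "inverted": global_ssim(target, inverted),
}
```

Under this score, a candidate source task whose masks resemble the target's RoI shape (here, `shifted`) ranks above one whose masks do not (`inverted`).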
In computational advertising, a challenging problem is how to recommend a bid that lets advertisers achieve the best return on investment (ROI) under a budget constraint. This paper presents a bid recommendation scenario that discovers the concavity changes in click prediction curves. The recommended bid is derived from the turning point where the curve shifts from significant increase (i.e., concave downward) to slow increase (convex upward). A parametric learning-based method is applied by solving the corresponding constrained optimization problem. Empirical studies on real-world advertising scenarios clearly demonstrate performance gains on business metrics (including revenue increase, click increase, and advertiser ROI increase).
Background samples provide key contextual information for segmenting regions of interest (ROIs). However, they always cover a diverse set of structures, causing difficulties for the segmentation model to learn good decision boundaries with high sensitivity and precision. The issue concerns the highly heterogeneous nature of the background class, resulting in multi-modal distributions. Empirically, we find that neural networks trained with heterogeneous background struggle to map the corresponding contextual samples to compact clusters in feature space. As a result, the distribution over background logit activations may shift across the decision boundary, leading to systematic over-segmentation across different datasets and tasks. In this study, we propose context label learning (CoLab) to improve the context representations by decomposing the background class into several subclasses. Specifically, we train an auxiliary network as a task generator, along with the primary segmentation model, to automatically generate context labels that positively affect the ROI segmentation accuracy. Extensive experiments are conducted on several challenging segmentation tasks and datasets. The results demonstrate that CoLab can guide the segmentation model to map the logits of background samples away from the decision boundary, resulting in significantly improved segmentation accuracy. Code is available.
Adapting object detectors learned with sufficient supervision to novel classes under low data regimes is charming yet challenging. In few-shot object detection (FSOD), the two-step training paradigm is widely adopted to mitigate the severe sample imbalance, i.e., holistic pre-training on base classes, then partial fine-tuning in a balanced setting with all classes. Since unlabeled instances are suppressed as backgrounds in the base training phase, the learned RPN is prone to produce biased proposals for novel instances, resulting in dramatic performance degradation. Unfortunately, the extreme data scarcity aggravates the proposal distribution bias, hindering the RoI head from evolving toward novel classes. In this paper, we introduce a simple yet effective proposal distribution calibration (PDC) approach to neatly enhance the localization and classification abilities of the RoI head by recycling its localization ability endowed in base training and enriching high-quality positive samples for semantic fine-tuning. Specifically, we sample proposals based on the base proposal statistics to calibrate the distribution bias and impose additional localization and classification losses upon the sampled proposals for fast expanding the base detector to novel classes. Experiments on the commonly used Pascal VOC and MS COCO datasets with explicit state-of-the-art performances justify the efficacy of our PDC for FSOD. Code is available at github.com/Bohao-Lee/PDC.